543 research outputs found
Multivariate Bayesian semiparametric models for authentication of food and beverages
Food and beverage authentication is the process by which foods or beverages
are verified as complying with their label descriptions, for example, verifying if
the denomination of origin of an olive oil bottle is correct or if the variety
of a certain bottle of wine matches its label description. The common way to
deal with an authentication process is to measure a number of attributes on
samples of food and then use these as input for a classification problem. Our
motivation stems from data consisting of measurements of nine chemical
compounds known as anthocyanins, obtained from samples of Chilean red wines
of grape varieties Cabernet Sauvignon, Merlot and Carménère. We
consider a model-based approach to authentication through a semiparametric
multivariate hierarchical linear mixed model for the mean responses, and
covariance matrices that are specific to the classification categories.
Specifically, we propose a model of the ANOVA-DDP type, which takes advantage
of the fact that the available covariates are discrete in nature. The results
suggest that the model performs well compared to parametric alternatives.
This is also corroborated by application to simulated data. Comment: Published at http://dx.doi.org/10.1214/11-AOAS492 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
A Simple Class of Bayesian Nonparametric Autoregression Models
We introduce a model for a time series of continuous outcomes that can be expressed as a fully nonparametric regression or density regression on lagged terms. The model is based on a dependent Dirichlet process prior on a family of random probability measures indexed by the lagged covariates. The approach is also extended to sequences of binary responses. We discuss implementation and applications of the models to a sequence of waiting times between eruptions of the Old Faithful Geyser, and to a dataset consisting of sequences of recurrence indicators for tumors in the bladder of several patients. MIUR 2008MK3AFZ; FONDECYT 1100010; NIH/NCI R01CA075981; Mathematic
Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis
A prespecified set of genes may be enriched, to varying degrees, for genes
that have altered expression levels relative to two or more states of a cell.
Knowing the enrichment of gene sets defined by functional categories, such as
gene ontology (GO) annotations, is valuable for analyzing the biological
signals in microarray expression data. A common approach to measuring
enrichment is by cross-classifying genes according to membership in a
functional category and membership on a selected list of significantly altered
genes. A small Fisher's exact test p-value, for example, in this 2×2
table is indicative of enrichment. Other category analysis methods retain the
quantitative gene-level scores and measure significance by referring a
category-level statistic to a permutation distribution associated with the
original differential expression problem. We describe a class of random-set
scoring methods that measure distinct components of the enrichment signal. The
class includes Fisher's test based on selected genes and also tests that
average gene-level evidence across the category. Averaging and selection
methods are compared empirically using Affymetrix data on expression in
nasopharyngeal cancer tissue, and theoretically using a location model of
differential expression. We find that each method has a domain of superiority
in the state space of enrichment problems, and that both methods have benefits
in practice. Our analysis also addresses two problems related to
multiple-category inference, namely, that equally enriched categories are not
detected with equal probability if they are of different sizes, and also that
there is dependence among category statistics owing to shared genes. Random-set
enrichment calculations do not require Monte Carlo for implementation. They are
made available in the R package allez. Comment: Published at http://dx.doi.org/10.1214/07-AOAS104 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
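The two kinds of enrichment score contrasted in this abstract can be illustrated with a minimal Python sketch. It is not the allez implementation; the function names and arguments are hypothetical. The selection score is a one-sided Fisher's exact test p-value computed from the hypergeometric distribution. The averaging score standardizes the category mean of gene-level scores using the exact mean and variance of the mean of a random set of the same size drawn without replacement, which is why no Monte Carlo permutation is needed. That standardization is the standard finite-population formula and is assumed here to match the abstract's intent.

```python
from math import comb, sqrt


def fisher_upper_p(n_genes, n_category, n_selected, n_overlap):
    """One-sided Fisher exact p-value for the 2x2 table.

    Probability that a random list of n_selected genes (out of n_genes)
    shares at least n_overlap genes with a category of size n_category.
    """
    total = comb(n_genes, n_selected)
    upper = min(n_category, n_selected)
    tail = sum(
        comb(n_category, k) * comb(n_genes - n_category, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def random_set_z(scores, category):
    """Standardized category mean under a random-set null (hypothetical helper).

    scores: gene-level scores for all genes; category: indices of member genes.
    The null treats the category as a uniformly random subset of fixed size,
    so the mean and variance of its average are available in closed form.
    """
    n_genes = len(scores)
    m = len(category)
    mu = sum(scores) / n_genes
    var_pop = sum((s - mu) ** 2 for s in scores) / n_genes
    # Finite-population correction for sampling without replacement.
    var_mean = var_pop / m * (n_genes - m) / (n_genes - 1)
    xbar = sum(scores[g] for g in category) / m
    return (xbar - mu) / sqrt(var_mean)
```

For example, `fisher_upper_p(10, 4, 3, 3)` asks how likely it is that all three selected genes fall in a four-gene category among ten genes, while `random_set_z` uses every gene's score rather than a thresholded list.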
1020 steel coated with Ti/TiN by Cathodic Arc and Ion Implantation
TiN coatings have been widely studied as a means of improving the mechanical properties of steels. In this work, thin Ti/TiN films were prepared by plasma-based immersion ion implantation and deposition (PBII&D) with a cathodic arc on AISI 1020 steel substrates. Substrates were exposed to the discharge for 1 min in vacuum to deposit a Ti underlayer intended to improve adhesion to the substrate. Then, a TiN layer was deposited for 6 min in a nitrogen environment at a pressure of 3×10⁻⁴ mbar. Samples were obtained at room temperature and at 300 °C, with and without ion implantation, in order to analyze how each treatment affects the tribological properties. The mechanical and tribological properties of the films were characterized. The coatings deposited by PBII&D at 300 °C presented the highest hardness and Young's modulus, and the best wear resistance and corrosion performance. Fil: Bermeo, Diego Fernando. Universidad Santiago de Cali; Colombia. Fil: Quintana, Juan Pablo. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física del Plasma. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física del Plasma; Argentina. Fil: Kleiman, Ariel Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física del Plasma. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física del Plasma; Argentina. Fil: Sequeda, F.. Universidad del Valle; Colombia. Fil: Márquez, A.. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física del Plasma. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física del Plasma; Argentin
DPpackage: Bayesian Semi- and Nonparametric Modeling in R
Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.
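The role of the Dirichlet process precision parameter mentioned above can be made concrete with a small Python sketch. It does not use or reproduce any DPpackage function; the names `expected_clusters` and `elicit_alpha` are hypothetical. It relies only on the standard result that, for n observations from a Dirichlet process with precision alpha, the expected number of distinct clusters is the sum of alpha / (alpha + i) for i = 0, ..., n - 1. One simple way to elicit alpha, assumed here purely for illustration, is to solve for the value that matches a prior guess of the number of clusters.

```python
def expected_clusters(alpha, n):
    """Prior expected number of distinct clusters among n observations."""
    return sum(alpha / (alpha + i) for i in range(n))


def elicit_alpha(target_k, n, lo=1e-8, hi=1e6, tol=1e-10):
    """Find alpha whose expected cluster count equals target_k (1 < target_k < n).

    Uses bisection, which works because expected_clusters is strictly
    increasing in alpha. Hypothetical helper, not a DPpackage routine.
    """
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if expected_clusters(mid, n) < target_k:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


if __name__ == "__main__":
    alpha = elicit_alpha(target_k=5, n=100)
    print(alpha, expected_clusters(alpha, 100))
```

A point value of alpha is the simplest choice; placing a prior on alpha, as is common in practice, would require matching moments of the cluster count instead.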
- …